Rights declaration in Linked Data
Linked Data is not always published with a license. Sometimes a wrong license type is used, such as a license intended for software, or the license is not expressed in a standard, machine-readable manner. Yet Linked Data resources may be subject to intellectual property and database laws, may contain personal data subject to privacy restrictions, or may even contain important trade secrets. The proper declaration of which rights are held, waived, or licensed is a prerequisite for the lawful use of Linked Data at its different granularity levels, from the simple RDF statement to a dataset or a mapping. After comparing current practice with the actual needs, six research questions are posed.
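To make the machine-readable declaration idea concrete, here is a minimal Python sketch using rdflib with the Dublin Core Terms and VoID vocabularies; the dataset URI and the chosen license are illustrative assumptions, not taken from the paper.

    # A minimal sketch of a machine-readable rights declaration for a dataset.
    # The dataset URI and the chosen license are illustrative assumptions.
    from rdflib import Graph, Namespace, RDF, URIRef

    DCT = Namespace("http://purl.org/dc/terms/")
    VOID = Namespace("http://rdfs.org/ns/void#")

    g = Graph()
    dataset = URIRef("http://example.org/dataset")  # hypothetical dataset
    g.add((dataset, RDF.type, VOID.Dataset))
    # Declare the license in a standard, machine-readable way:
    g.add((dataset, DCT.license,
           URIRef("https://creativecommons.org/licenses/by/4.0/")))

    print(g.serialize(format="turtle"))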
A Linked Data Platform adapter for the Bugzilla issue tracker
The W3C Linked Data Platform (LDP) specification defines a standard HTTP-based protocol for read/write Linked Data and provides the basis for application integration using Linked Data. This paper presents an LDP adapter for the Bugzilla issue tracker and demonstrates how to use the LDP protocol to expose a traditional application as a read/write Linked Data application. This approach provides a flexible LDP adoption strategy with minimal changes to existing applications.
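The basic read/write pattern such an adapter exposes can be sketched in a few lines of Python; the container URL and the bug payload below are hypothetical, and only the HTTP headers follow the LDP specification.

    # A minimal sketch of the LDP read/write pattern: GET a container of
    # bugs, then POST a new bug as RDF. The endpoint and triples are invented.
    import requests

    CONTAINER = "http://localhost:8080/ldp/bugs"  # hypothetical LDP container

    # Read: dereference the container as Turtle.
    resp = requests.get(CONTAINER, headers={"Accept": "text/turtle"})
    print(resp.text)

    # Write: create a new member resource by POSTing RDF to the container.
    new_bug = """
    @prefix dcterms: <http://purl.org/dc/terms/> .
    <> dcterms:title "Crash when saving attachments" .
    """
    resp = requests.post(
        CONTAINER,
        data=new_bug,
        headers={"Content-Type": "text/turtle", "Slug": "bug-42"},
    )
    print(resp.status_code, resp.headers.get("Location"))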
Seven challenges for RESTful transaction models
The REpresentational State Transfer (REST) architectural style describes the design principles that made the World Wide Web scalable, and the same principles can be applied in an enterprise context to achieve loosely coupled and scalable application integration. In recent years, RESTful services have been gaining traction in industry and are commonly used as a simpler alternative to SOAP Web Services. However, one of the main drawbacks of RESTful services is the lack of standard mechanisms to support advanced quality-of-service requirements that are common in enterprises. Transaction processing is one of the essential features of enterprise information systems, and several transaction models have been proposed in past years to fill the gap of transaction processing in RESTful services. The goal of this paper is to analyze the state-of-the-art RESTful transaction models and identify the current challenges.
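One family of models surveyed in this space treats the transaction itself as a resource; a minimal Python sketch of that pattern follows, where the endpoints and payloads are illustrative assumptions rather than a standard API.

    # "Transaction as a resource": the client creates a transaction resource,
    # performs work within its scope, and commits by updating its state.
    import requests

    BASE = "http://localhost:8080/api"  # hypothetical service

    # 1. Begin: create a transaction resource; its URL identifies the transaction.
    tx_url = requests.post(f"{BASE}/transactions").headers["Location"]

    # 2. Work: each operation declares the transaction it belongs to.
    requests.post(f"{BASE}/orders",
                  json={"item": "X-100", "qty": 2},
                  headers={"Link": f'<{tx_url}>; rel="transaction"'})

    # 3. End: commit (or roll back) by changing the transaction's state.
    requests.put(tx_url, json={"status": "committed"})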
A Coreference Service for Enterprise Application Integration using Linked Data
The use of semantic and Linked Data technologies for Enterprise Application Integration (EAI) has been increasing in recent years. Linked Data and Semantic Web technologies such as the Resource Description Framework (RDF) data model provide several key advantages over the current de facto Web Service and XML-based integration approaches. The flexibility gained by representing data in the more versatile RDF model using ontologies avoids complex schema transformations and makes data more accessible through Web standards, preventing the formation of data silos. These benefits give Linked Data-based EAI an edge. However, work still has to be done so that these technologies can cope with the particularities of EAI scenarios in terms of data control, ownership, consistency, and accuracy.
The first part of the paper introduces Enterprise Application Integration using Linked Data and the requirements EAI imposes on Linked Data technologies, focusing on one of the problems that arises in this scenario, the coreference problem, and presents a coreference service that supports the use of Linked Data in EAI systems. The proposed solution introduces the use of a context that aggregates a set of related identities together with mappings from those identities to resources that reside in distinct applications and provide different views or aspects of the same entity. A detailed architecture of the Coreference Service is presented, explaining how it can be used to manage contexts, identities, resources, and the applications to which they relate. The paper shows how the proposed service can be used in an EAI scenario through an example involving a dashboard that integrates data from different systems, together with the proposed workflow for registering and resolving identities. As most enterprise applications are driven by business processes and involve legacy data, the proposed approach can be easily incorporated into enterprise applications.
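The context-identity-resource structure described above can be sketched as a small in-memory data structure; all names and URIs below are illustrative assumptions, not the paper's API.

    # A context groups identities and maps each identity to the resources
    # that represent the same entity in different applications.
    from collections import defaultdict

    class CoreferenceContext:
        def __init__(self, name):
            self.name = name
            # identity URI -> {application name -> resource URI}
            self.mappings = defaultdict(dict)

        def register(self, identity, application, resource):
            self.mappings[identity][application] = resource

        def resolve(self, identity, application):
            """Return the resource representing `identity` in `application`."""
            return self.mappings[identity].get(application)

    ctx = CoreferenceContext("customer-integration")
    ctx.register("urn:entity:acme", "crm", "https://crm.example/customers/17")
    ctx.register("urn:entity:acme", "billing", "https://billing.example/accounts/9")
    print(ctx.resolve("urn:entity:acme", "billing"))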
Text2KGBench: A Benchmark for Ontology-Driven Knowledge Graph Generation from Text
The recent advances in large language models (LLMs) and foundation models with emergent capabilities have been shown to improve the performance of many NLP tasks. LLMs and Knowledge Graphs (KGs) can complement each other: LLMs can be used for KG construction or completion, while existing KGs can be used for tasks such as making LLM outputs explainable or fact-checking in a Neuro-Symbolic manner. In this paper, we present Text2KGBench, a benchmark to evaluate the capabilities of language models to generate KGs from natural language text guided by an ontology. Given an input ontology and a set of sentences, the task is to extract facts from the text while complying with the given ontology (concepts, relations, domain/range constraints) and being faithful to the input sentences. We provide two datasets: (i) Wikidata-TekGen with 10 ontologies and 13,474 sentences and (ii) DBpedia-WebNLG with 19 ontologies and 4,860 sentences. We define seven evaluation metrics to measure fact extraction performance, ontology conformance, and hallucinations by LLMs. Furthermore, we provide results for two baseline models, Vicuna-13B and Alpaca-LoRA-13B, using automatic prompt generation from test cases. The baseline results show that there is room for improvement using both Semantic Web and Natural Language Processing techniques.
Comment: 15 pages, 3 figures, 4 tables. Accepted at ISWC 2023 (Resources Track).
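The shape of the task can be illustrated with a small, invented Python example; the sentence, ontology fragment, and triples below are not items from the benchmark.

    # Given an ontology fragment and a sentence, the model must emit only
    # triples whose relations come from the ontology. Invented example.
    ontology = {
        "concepts": ["Person", "City"],
        "relations": {"birthPlace": ("Person", "City")},  # name: (domain, range)
    }
    sentence = "Ada Lovelace was born in London."
    expected = [("Ada_Lovelace", "birthPlace", "London")]

    def conforms(triple, ontology):
        """Check that the relation of an extracted triple is in the ontology."""
        _, relation, _ = triple
        return relation in ontology["relations"]

    assert all(conforms(t, ontology) for t in expected)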
KGI: An Integrated Framework for Knowledge Intensive Language Tasks
In a recent work, we presented a novel state-of-the-art approach to zero-shot slot filling that extends dense passage retrieval with hard negatives and robust training procedures for retrieval-augmented generation models. In this paper, we propose a system based on an enhanced version of this approach, where we train task-specific models for other knowledge-intensive language tasks, such as open-domain question answering (QA), dialogue, and fact checking. Our system achieves results comparable to the best models on the KILT leaderboards. Moreover, given a user query, we show how the outputs from these different models can be combined to cross-examine each other. In particular, we show how accuracy in dialogue can be improved using the QA model. A short video demonstrating the system is available at https://ibm.box.com/v/kgi-interactive-demo.
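The cross-examination idea can be sketched roughly in Python: rephrase the dialogue context as a question and prefer the candidate response the QA model's answer supports. The model calls and scoring below are placeholders, not the paper's actual components.

    def cross_examine(dialogue_responses, question, qa_model):
        """Rank candidate dialogue responses by agreement with the QA answer."""
        answer = qa_model(question)  # hypothetical QA model call
        def agreement(response):
            return 1.0 if answer.lower() in response.lower() else 0.0
        return max(dialogue_responses, key=agreement)

    # Usage with a stub QA model:
    qa_model = lambda q: "Paris"
    candidates = ["It is Paris.", "I think it is Lyon."]
    print(cross_examine(candidates, "What is the capital of France?", qa_model))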
A Quality Assessment Approach for Evolving Knowledge Bases
Knowledge bases are nowadays essential components for any task that requires automation with some degree of intelligence. Assessing the quality of a Knowledge Base (KB) is a complex task, as it often means measuring the quality of structured information, ontologies and vocabularies, and queryable endpoints. Popular knowledge bases such as DBpedia, YAGO2, and Wikidata have chosen the RDF data model to represent their data due to its capabilities for semantically rich knowledge representation. Despite its advantages, there are challenges in using the RDF data model, for example, in data quality assessment and validation. In this paper, we present a novel knowledge base quality assessment approach that relies on evolution analysis. The proposed approach uses data profiling on consecutive knowledge base releases to compute quality measures that allow detecting quality issues. Our quality characteristics are based on KB evolution analysis, and we use high-level change detection for the measurement functions. In particular, we propose four quality characteristics: Persistency, Historical Persistency, Consistency, and Completeness. The Persistency and Historical Persistency measures concern the degree of change and the lifespan of any entity type. The Consistency and Completeness measures identify properties with incomplete information and contradictory facts. The approach has been assessed both quantitatively and qualitatively on a series of releases from two knowledge bases: eleven releases of DBpedia and eight releases of 3cixty. The capability of the Persistency and Consistency characteristics to detect quality issues varies significantly between the two case studies. The Persistency measure gives observational results for evolving KBs and is highly effective in the case of KBs with periodic updates, such as the 3cixty KB. The Completeness characteristic is extremely effective and was able to achieve 95% precision in error detection for both use cases. The measures are based on simple statistical operations that make the solution both flexible and scalable.
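A minimal Python sketch in the spirit of the persistency idea: flag an entity type when its instance count drops between consecutive releases. The exact measure in the paper may differ, and the counts below are invented.

    def persistency_issues(release_counts):
        """release_counts: list of {entity_type: count}, ordered by release."""
        issues = []
        for prev, curr in zip(release_counts, release_counts[1:]):
            for etype, n in curr.items():
                if n < prev.get(etype, 0):
                    issues.append((etype, prev[etype], n))
        return issues

    releases = [{"Place": 12000, "Event": 800}, {"Place": 11950, "Event": 900}]
    print(persistency_issues(releases))  # [('Place', 12000, 11950)]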
Hypernym Detection Using Strict Partial Order Networks
This paper introduces Strict Partial Order Networks (SPON), a novel neural network architecture designed to enforce asymmetry and transitivity as soft constraints. We apply it to induce hypernymy relations by training with is-a pairs. We also present an augmented variant of SPON that can generalize type information learned for in-vocabulary terms to previously unseen ones. An extensive evaluation over eleven benchmarks across different tasks shows that SPON consistently either outperforms or attains the state of the art on all but one of these benchmarks.
Comment: 8 pages.
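One common way to encode asymmetry and transitivity as a soft constraint is an order-embedding-style penalty; the Python sketch below illustrates that general idea only and is not the exact SPON architecture from the paper.

    import numpy as np

    def order_violation(x, y):
        """Penalty for 'x is-a y'; zero iff x <= y coordinate-wise.
        The relation induced by a zero penalty is transitive and, for
        distinct vectors, holds in at most one direction."""
        return float(np.sum(np.maximum(0.0, x - y) ** 2))

    dog, animal = np.array([0.2, 0.1]), np.array([0.5, 0.4])
    print(order_violation(dog, animal))   # 0.0 -> consistent with dog is-a animal
    print(order_violation(animal, dog))   # > 0 -> penalized in the other direction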
morph-LDP: an R2RML-based Linked Data Platform implementation
The W3C Linked Data Platform (LDP) candidate recommendation defines a standard HTTP-based protocol for read/write Linked Data. The W3C R2RML recommendation defines a language to map relational databases (RDBs) to RDF. This paper presents morph-LDP, a novel system that combines these two W3C standardization initiatives to expose relational data as read/write Linked Data for LDP-aware applications, whilst allowing legacy applications to continue using their relational databases.
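As a hedged example of the kind of R2RML mapping such a system consumes, the Python sketch below parses a single triples map with rdflib; only the rr: vocabulary is standard, while the table and columns are invented.

    # One R2RML triples map exposing a hypothetical EMPLOYEE table as RDF.
    from rdflib import Graph

    mapping = """
    @prefix rr: <http://www.w3.org/ns/r2rml#> .
    @prefix ex: <http://example.org/ns#> .

    <#EmployeeMap> a rr:TriplesMap ;
        rr:logicalTable [ rr:tableName "EMPLOYEE" ] ;
        rr:subjectMap [
            rr:template "http://example.org/employee/{ID}" ;
            rr:class ex:Employee
        ] ;
        rr:predicateObjectMap [
            rr:predicate ex:name ;
            rr:objectMap [ rr:column "NAME" ]
        ] .
    """
    g = Graph().parse(data=mapping, format="turtle")
    print(len(g), "mapping triples parsed")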
Completeness and Consistency Analysis for Evolving Knowledge Bases
Assessing the quality of an evolving knowledge base is a challenging task, as it often requires identifying appropriate quality assessment procedures. Since data is often derived from autonomous and increasingly large data sources, it is impractical to curate the data manually, and challenging to assess its quality continuously and automatically. In this paper, we explore two main areas of quality assessment related to evolving knowledge bases: (i) identification of completeness issues using knowledge base evolution analysis, and (ii) identification of consistency issues based on integrity constraints, such as minimum and maximum cardinality and range constraints. For completeness analysis, we use data profiling information from consecutive knowledge base releases to estimate completeness measures that allow predicting quality issues. Then, we perform consistency checks to validate the results of the completeness analysis using integrity constraints and learning models. The approach has been tested both quantitatively and qualitatively using a subset of datasets from the DBpedia and 3cixty knowledge bases. The performance of the approach is evaluated using precision, recall, and F1 score. From the completeness analysis, we observe 94% precision for the English DBpedia KB and 95% precision for the 3cixty Nice KB. We also assessed the performance of our consistency analysis using five learning models over three sub-tasks, namely minimum cardinality, maximum cardinality, and range constraints. We observed that the best performing model in our experimental setup is the Random Forest, reaching an F1 score greater than 90% for minimum and maximum cardinality and 84% for range constraints.
Comment: Accepted for the Journal of Web Semantics.
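A minimal Python sketch of the cardinality-style consistency check described above: flag subjects whose number of values for a property falls outside a [min, max] constraint. The constraint and the data are invented examples, not drawn from the evaluated KBs.

    from collections import Counter

    def cardinality_violations(triples, prop, min_card, max_card):
        """Return subjects whose value count for `prop` is out of bounds."""
        counts = Counter(s for s, p, _ in triples if p == prop)
        return {s: n for s, n in counts.items() if not (min_card <= n <= max_card)}

    triples = [
        ("ex:e1", "ex:startDate", "2023-06-01"),
        ("ex:e1", "ex:startDate", "2023-06-02"),  # duplicate start date
        ("ex:e2", "ex:startDate", "2023-07-01"),
    ]
    # An event should have exactly one start date (min = max = 1).
    print(cardinality_violations(triples, "ex:startDate", 1, 1))  # {'ex:e1': 2}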